
Subquadratic High-Dimensional Hierarchical Clustering

Neural Information Processing Systems

We consider the widely-used average-linkage, single-linkage, and Ward's methods for computing hierarchical clusterings of high-dimensional Euclidean inputs. It is easy to show that there is no efficient implementation of these algorithms in high-dimensional Euclidean space, since each step implicitly requires solving the closest pair problem, a notoriously difficult problem. However, how fast can these algorithms be implemented if we allow approximation? More precisely: these algorithms successively merge the two clusters that are closest under the linkage cost, namely the average pairwise distance (for average-linkage), the minimum pairwise distance (for single-linkage), or the least increase in sum-of-squared error (for Ward's). We ask whether one could obtain a significant running-time improvement if the algorithm is allowed to merge $\gamma$-approximate closest clusters, i.e., clusters whose linkage cost (average distance, minimum distance, or sum-of-squared-error increase) is at most $\gamma$ times that of the closest pair of clusters. We show that one can indeed take advantage of this relaxation and compute the approximate hierarchical clustering tree using $\widetilde{O}(n)$ $\gamma$-approximate nearest neighbor queries.
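To make the three linkage costs concrete, here is a minimal, deliberately naive Python sketch of exact agglomerative merging. This is an illustration only, not the paper's subquadratic algorithm: all function names are hypothetical, and the exact minimum over all cluster pairs is precisely the expensive step the paper replaces with approximate nearest neighbor queries.

```python
import itertools
import math

def avg_linkage(a, b):
    # Average linkage: mean pairwise Euclidean distance between clusters.
    return sum(math.dist(p, q) for p in a for q in b) / (len(a) * len(b))

def single_linkage(a, b):
    # Single linkage: minimum pairwise Euclidean distance.
    return min(math.dist(p, q) for p in a for q in b)

def wards_cost(a, b):
    # Ward's method: increase in sum of squared errors caused by merging.
    def sse(c):
        dim = len(c[0])
        mean = [sum(p[i] for p in c) / len(c) for i in range(dim)]
        return sum(math.dist(p, mean) ** 2 for p in c)
    return sse(a + b) - sse(a) - sse(b)

def naive_hac(points, cost):
    # Exact agglomerative clustering: repeatedly merge the pair of
    # clusters with minimum linkage cost; returns the merge sequence.
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        i, j = min(itertools.combinations(range(len(clusters)), 2),
                   key=lambda ij: cost(clusters[ij[0]], clusters[ij[1]]))
        merges.append((tuple(map(tuple, clusters[i])),
                       tuple(map(tuple, clusters[j]))))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
merges = naive_hac(pts, single_linkage)
# The two nearby pairs are merged before the final cross-pair merge.
```

Each round scans all cluster pairs, so this exact version runs in at least quadratic time; the paper's point is that the $\gamma$-approximate relaxation avoids exactly this scan.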


Reviews: Subquadratic High-Dimensional Hierarchical Clustering

Neural Information Processing Systems

This paper proposes a new approach to approximating hierarchical agglomerative clustering (HAC) by requiring that, at each round, only a gamma-best merge be performed (gamma being the multiplicative approximation factor relative to the closest pair). Two algorithms are introduced to approximate HAC: one for Ward's method and one for average linkage. In both cases, the algorithms use approximate nearest neighbor (ANN) search as a black box. In addition, a bucketing data structure is used in the Ward's algorithm, and a subsampling procedure is used for average linkage, to guarantee the subquadratic runtime. This is a new contribution to the theoretical literature on HAC: a provably subquadratic algorithm for (an approximation to) HAC in cases other than single linkage.
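The gamma-best merge rule the review describes can be sketched as a single merge step. This is a hypothetical illustration, not the paper's algorithm: the exact linear scan below stands in for the ANN black box, centroid distance stands in for the actual linkage costs, and the bucketing and subsampling machinery is not reproduced.

```python
import math

def approx_merge_step(clusters, gamma):
    # Each cluster is (centroid, size); merging averages centroids.
    # Centroid distance is a simplified stand-in for the linkage cost.
    d = lambda a, b: math.dist(a[0], b[0])
    pairs = [(i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))]
    d_min = min(d(clusters[i], clusters[j]) for i, j in pairs)
    # gamma-approximate rule: merging ANY pair within gamma * d_min of
    # the optimum is allowed, so an ANN query can supply such a pair
    # without the exact all-pairs scan performed here for illustration.
    i, j = next((i, j) for i, j in pairs
                if d(clusters[i], clusters[j]) <= gamma * d_min)
    (ca, na), (cb, nb) = clusters[i], clusters[j]
    merged = ([(x * na + y * nb) / (na + nb) for x, y in zip(ca, cb)],
              na + nb)
    return [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]

clusters = [([0.0, 0.0], 1), ([0.2, 0.0], 1), ([4.0, 4.0], 1)]
clusters = approx_merge_step(clusters, gamma=2.0)
```

The relaxation is what makes subquadratic time plausible: any pair returned by a gamma-approximate nearest neighbor query already satisfies the merge condition, so no exact closest-pair computation is needed.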


Reviews: Subquadratic High-Dimensional Hierarchical Clustering

Neural Information Processing Systems

This paper was very much a borderline paper, with two accept scores and one reject score. One concern raised by the negative reviewer was that, while the algorithm achieves an approximation to the best merge at each step, it is unclear how the final clustering results compare to those of the standard algorithm. The authors addressed this in their rebuttal, which helped. There were also some issues raised about the experiments, as well as various minor suggestions (typos etc.). Overall the concerns are mostly minor, and the paper makes an interesting and worthwhile contribution, so I am recommending that it be accepted.



Subquadratic High-Dimensional Hierarchical Clustering

Abboud, Amir, Cohen-Addad, Vincent, Houdrouge, Hussein

Neural Information Processing Systems